155 research outputs found

    Video Storytelling: Textual Summaries for Events

    Full text link
    Bridging vision and natural language is a longstanding goal in computer vision and multimedia research. While earlier works focus on generating a single-sentence description for visual content, recent works have studied paragraph generation. In this work, we introduce the problem of video storytelling, which aims at generating coherent and succinct stories for long videos. Video storytelling introduces new challenges, mainly due to the diversity of the story and the length and complexity of the video. We propose novel methods to address the challenges. First, we propose a context-aware framework for multimodal embedding learning, where we design a Residual Bidirectional Recurrent Neural Network to leverage contextual information from past and future. Second, we propose a Narrator model to discover the underlying storyline. The Narrator is formulated as a reinforcement learning agent which is trained by directly optimizing the textual metric of the generated story. We evaluate our method on the Video Story dataset, a new dataset that we have collected to enable the study. We compare our method with multiple state-of-the-art baselines, and show that our method achieves better performance, in terms of quantitative measures and user study.Comment: Published in IEEE Transactions on Multimedi

    Multi-keyword multi-click advertisement option contracts for sponsored search

    Full text link
    In sponsored search, advertisement (abbreviated ad) slots are usually sold by a search engine to an advertiser through an auction mechanism in which advertisers bid on keywords. In theory, auction mechanisms have many desirable economic properties. However, keyword auctions have a number of limitations including: the uncertainty in payment prices for advertisers; the volatility in the search engine's revenue; and the weak loyalty between advertiser and search engine. In this paper we propose a special ad option that alleviates these problems. In our proposal, an advertiser can purchase an option from a search engine in advance by paying an upfront fee, known as the option price. He then has the right, but no obligation, to purchase among the pre-specified set of keywords at the fixed cost-per-clicks (CPCs) for a specified number of clicks in a specified period of time. The proposed option is closely related to a special exotic option in finance that contains multiple underlying assets (multi-keyword) and is also multi-exercisable (multi-click). This novel structure has many benefits: advertisers can have reduced uncertainty in advertising; the search engine can improve the advertisers' loyalty as well as obtain a stable and increased expected revenue over time. Since the proposed ad option can be implemented in conjunction with the existing keyword auctions, the option price and corresponding fixed CPCs must be set such that there is no arbitrage between the two markets. Option pricing methods are discussed and our experimental results validate the development. Compared to keyword auctions, a search engine can have an increased expected revenue by selling an ad option.Comment: Chen, Bowei and Wang, Jun and Cox, Ingemar J. and Kankanhalli, Mohan S. (2015) Multi-keyword multi-click advertisement option contracts for sponsored search. ACM Transactions on Intelligent Systems and Technology, 7 (1). pp. 1-29. ISSN: 2157-690

    Music Structure Analysis Statistics for Popular Songs

    Get PDF
    Non

    Fast Yet Effective Machine Unlearning

    Full text link
    Unlearning the data observed during the training of a machine learning (ML) model is an important task that can play a pivotal role in fortifying the privacy and security of ML-based applications. This paper raises the following questions: (i) can we unlearn a single or multiple classes of data from an ML model without looking at the full training data even once? (ii) can we make the process of unlearning fast and scalable to large datasets, and generalize it to different deep networks? We introduce a novel machine unlearning framework with error-maximizing noise generation and impair-repair based weight manipulation that offers an efficient solution to the above questions. An error-maximizing noise matrix is learned for the class to be unlearned using the original model. The noise matrix is used to manipulate the model weights to unlearn the targeted class of data. We introduce impair and repair steps for a controlled manipulation of the network weights. In the impair step, the noise matrix along with a very high learning rate is used to induce sharp unlearning in the model. Thereafter, the repair step is used to regain the overall performance. With very few update steps, we show excellent unlearning while substantially retaining the overall model accuracy. Unlearning multiple classes requires a similar number of update steps as for the single class, making our approach scalable to large problems. Our method is quite efficient in comparison to the existing methods, works for multi-class unlearning, doesn't put any constraints on the original optimization mechanism or network design, and works well in both small and large-scale vision tasks. This work is an important step towards fast and easy implementation of unlearning in deep networks. We will make the source code publicly available
    corecore